Invariant Action Effect Model for Reinforcement Learning
Authors
Abstract
Good representations can help RL agents perform concise modeling of their surroundings, and thus support effective decision-making in complex environments. Previous methods learn good representations by imposing extra constraints on dynamics. However, from the causal perspective, the causation between the action and its effect is not fully considered in those methods, which leads to the ignorance of the underlying relations among action effects on transitions. Based on the intuition that the same action always causes similar effects among different states, we induce such relations by taking the invariance of action effects among states as the relation. By explicitly utilizing this invariance, we show in this paper that a better representation can be learned, which potentially improves the sample efficiency and the generalization ability of the learned policy. We propose the Invariant Action Effect Model (IAEM) to capture such effects, where the effect of an action is represented as the residual of representations from neighboring states. IAEM is composed of two parts: (1) a new contrastive-based loss to capture the underlying invariance of action effects; (2) an individual action effect module that provides a self-adapted weighting strategy to tackle the corner cases where the invariance does not hold. Extensive experiments on two benchmarks, i.e., Grid-World and Atari, show that the representations learned by IAEM preserve the invariance of action effects. Moreover, with the invariant action effect, IAEM can accelerate the learning process by 1.6x, rapidly generalize to new environments by fine-tuning only a few components, and outperform other dynamics-based methods by 1.4x within limited steps.
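The two core ingredients of the abstract, representing an action's effect as the residual between neighboring state representations and aligning effects of the same action across states with a contrastive loss, can be illustrated with a short sketch. The following is a minimal PyTorch illustration, not the paper's implementation: the encoder phi, the batch layout, and the temperature are assumptions, and the self-adapted weighting for corner cases is omitted.

```python
# Minimal sketch of the IAEM idea (hypothetical names; details may differ
# from the paper). Assumes each action appears at least twice per batch.
import torch
import torch.nn.functional as F

def action_effect(phi, s, s_next):
    """Effect of an action as a representation residual: e = phi(s') - phi(s)."""
    return phi(s_next) - phi(s)

def invariance_contrastive_loss(effects, actions, temperature=0.1):
    """Contrastive loss pulling together effect vectors of the same action
    taken in different states (positives), pushing apart those of other
    actions (negatives)."""
    effects = F.normalize(effects, dim=-1)                 # (B, d)
    sim = effects @ effects.t() / temperature              # pairwise similarity
    same_action = actions.unsqueeze(0) == actions.unsqueeze(1)   # (B, B)
    not_self = ~torch.eye(len(actions), dtype=torch.bool)        # drop self-pairs
    pos = same_action & not_self
    # log-softmax over all non-self pairs, averaged over positive pairs
    log_prob = sim - torch.logsumexp(
        sim.masked_fill(~not_self, float('-inf')), dim=1, keepdim=True)
    return -log_prob[pos].mean()
```

In a training loop, one would compute `effects = action_effect(phi, s, s_next)` on a transition batch and add this loss to the usual dynamics or RL objective, so that the encoder is shaped toward action-effect invariance.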
Similar Resources
Scale invariant value computation for reinforcement learning
Natural learners must compute an estimate of future outcomes that follow from a stimulus in continuous time. Critically, the learner cannot in general know a priori the relevant time scale over which meaningful relationships will be observed. Widely used reinforcement learning algorithms discretize continuous time and use the Bellman equation to estimate exponentially-discounted future reward. ...
Lyapunov-Constrained Action Sets for Reinforcement Learning
Lyapunov analysis is a standard approach to studying the stability of dynamical systems and to designing controllers. We propose to design the actions of a reinforcement learning (RL) agent to be descending on a Lyapunov function. For minimum cost-to-target problems, this has the theoretical benefit of guaranteeing that the agent will reach a goal state on every trial, regardless of the RL algo...
Action Branching Architectures for Deep Reinforcement Learning
Discrete-action algorithms have been central to numerous recent successes of deep reinforcement learning. However, applying these algorithms to high-dimensional action tasks requires tackling the combinatorial increase of the number of possible actions with the number of action dimensions. This problem is further exacerbated for continuous-action tasks that require fine control of actions via d...
Reinforcement Learning with Action Discovery
The design of reinforcement learning solutions to many problems artificially constrains the action set available to an agent, in order to limit the exploration/sample complexity. While exploring, if an agent can discover new actions that break through the constraints of its basic/atomic action set, then the quality of the learned decision policy could improve. On the flipside, considering al...
Journal
Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence
Year: 2022
ISSN: 2159-5399, 2374-3468
DOI: https://doi.org/10.1609/aaai.v36i8.20913